Identifying gene-disease associations using centrality on a literature mined gene-interaction network

Authors

  • Arzucan Özgür
  • Thuy Vu
  • Günes Erkan
  • Dragomir R. Radev
Abstract

MOTIVATION: Understanding the role of genetics in diseases is one of the most important aims of the biological sciences. The completion of the Human Genome Project has led to a rapid increase in the number of publications in this area. However, the coverage of curated databases that provide information manually extracted from the literature is limited. Another challenge is that determining disease-related genes requires laborious experiments. Therefore, predicting good candidate genes before experimental analysis will save time and effort. We introduce an automatic approach based on text mining and network analysis to predict gene-disease associations. We collected an initial set of known disease-related genes and built an interaction network by automatic literature mining based on dependency parsing and support vector machines. Our hypothesis is that the central genes in this disease-specific network are likely to be related to the disease. We used the degree, eigenvector, betweenness and closeness centrality metrics to rank the genes in the network.

RESULTS: The proposed approach can be used to extract known and to infer unknown gene-disease associations. We evaluated the approach for prostate cancer. Eigenvector and degree centrality achieved high accuracy. A total of 95% of the top 20 genes ranked by these methods are confirmed to be related to prostate cancer. On the other hand, betweenness and closeness centrality predicted more genes whose relation to the disease is currently unknown and are candidates for experimental study.

AVAILABILITY: A web-based system for browsing the disease-specific gene-interaction networks is available at: http://gin.ncibi.org.
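The centrality-based ranking described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the literature-mining step (dependency parsing and support vector machines) is not reproduced, the gene labels `G1` to `G6` and their edges are a hypothetical toy network, and the function names are chosen here for illustration. The sketch computes the four metrics named in the abstract on an unweighted, undirected, connected graph using only the Python standard library.

```python
from collections import deque

def build_graph(edges):
    """Build an undirected adjacency map from (gene, gene) pairs."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph

def degree_centrality(graph):
    """Fraction of the other genes that each gene interacts with."""
    n = len(graph)
    return {v: len(nbrs) / (n - 1) for v, nbrs in graph.items()}

def closeness_centrality(graph):
    """Inverse of the mean shortest-path distance (assumes a connected graph)."""
    scores = {}
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from source
            u = queue.popleft()
            for w in graph[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total = sum(dist.values())
        scores[source] = (len(dist) - 1) / total if total else 0.0
    return scores

def eigenvector_centrality(graph, iterations=100):
    """Power iteration on A + I; the shift avoids oscillation on bipartite
    graphs and leaves the leading eigenvector of A unchanged."""
    scores = {v: 1.0 for v in graph}
    for _ in range(iterations):
        new = {v: scores[v] + sum(scores[w] for w in graph[v]) for v in graph}
        norm = max(new.values())
        scores = {v: s / norm for v, s in new.items()}
    return scores

def betweenness_centrality(graph):
    """Unnormalized betweenness via Brandes' algorithm (unweighted graph)."""
    scores = {v: 0.0 for v in graph}
    for s in graph:
        order = []                       # nodes in order of discovery
        preds = {v: [] for v in graph}   # shortest-path predecessors
        sigma = {v: 0 for v in graph}    # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            order.append(u)
            for w in graph[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
                if dist[w] == dist[u] + 1:
                    sigma[w] += sigma[u]
                    preds[w].append(u)
        delta = {v: 0.0 for v in graph}
        for w in reversed(order):        # accumulate dependencies back to s
            for u in preds[w]:
                delta[u] += sigma[u] / sigma[w] * (1 + delta[w])
            if w != s:
                scores[w] += delta[w]
    # Each unordered pair is counted from both endpoints in an undirected graph.
    return {v: b / 2 for v, b in scores.items()}

def rank(scores):
    """Genes sorted by descending score, ties broken by name."""
    return sorted(scores, key=lambda g: (-scores[g], g))

# Hypothetical toy network: a triangle (G1, G2, G3) joined to a chain G4-G5-G6.
toy_edges = [("G1", "G2"), ("G1", "G3"), ("G2", "G3"),
             ("G1", "G4"), ("G4", "G5"), ("G5", "G6")]
toy_graph = build_graph(toy_edges)

degree = degree_centrality(toy_graph)
closeness = closeness_centrality(toy_graph)
eigenvector = eigenvector_centrality(toy_graph)
betweenness = betweenness_centrality(toy_graph)

if __name__ == "__main__":
    for name, scores in [("degree", degree), ("eigenvector", eigenvector),
                         ("betweenness", betweenness), ("closeness", closeness)]:
        print(name, rank(scores))
```

In this toy graph, `G4` has a lower degree than `G1` but the same betweenness, because every shortest path between the triangle and the chain passes through it. This is one way a bridging gene can rank higher under betweenness than under degree, which is consistent with the abstract's observation that the metrics surface different candidates.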


Similar resources

Ignet: A Centrality and INO-based Web System for Analyzing and Visualizing Literature-mined Networks

Ignet (Integrative Gene Network) is a web-based system for dynamically updating and analyzing gene interaction networks mined using all PubMed abstracts. Four centrality metrics, namely degree, eigenvector, betweenness, and closeness are used to determine the importance of genes in the networks. Different gene interaction types between genes are classified using the Interaction Network Ontology...


Using the Protein-protein Interaction Network to Identifying the Biomarkers in Evolution of the Oocyte

Background Oocyte maturity includes nuclear and cytoplasmic maturity, both of which are important for embryo fertilization. The development of oocyte is not limited to the period of follicular growth, and starts from the embryonic period and continues throughout life. In this study, for the purpose of evaluating the effect of the FSH hormone on the expression of genes, GEO access codes for this...


Evaluation of Protein Complexes in Muscular Atrophy Using Interaction Map Analysis

Background and purpose: Muscular atrophy is a condition derived from different diseases and aging. Molecular study of the disease condition can help in developing diagnostic methods and treatment approaches. In this study, protein interaction network was analyzed to understand molecular events at protein levels. Materials and methods: In this experimental study, the network was constructed and...


Mining of vaccine-associated IFN-γ gene interaction networks using the Vaccine Ontology

BACKGROUND Interferon-gamma (IFN-γ) is vital in vaccine-induced immune defense against bacterial and viral infections and tumor. Our recent study demonstrated the power of a literature-based discovery method in extraction and comparison of the IFN-γ and vaccine-mediated gene interaction networks. The Vaccine Ontology (VO) contains a hierarchy of vaccine names. It is hypothesized that the applic...


Investigating the effects of ibuprofen on the gene expression profile in Hippocampus of mice model of Alzheimer’s disease through bioinformatics analysis

Non-steroidal anti-inflammatory drugs (NSAIDs) have been identified as effective in many diseases, including neurodegenerative diseases such as Alzheimer's disease (AD). In this study, gross alteration of gene expression in AD mice by ibuprofen treatment is investigated via protein-protein interaction (PPI) network analysis. Expression profiling of microarray dataset GSE67306 was retrieved from GEO dat...



Journal title:

Volume 24  Issue

Pages  -

Publication date: 2008